A Personalized Document Clustering Approach to Addressing Individual Categorization Preferences

Authors

  • Chih-Ping Wei
  • Roger H.L. Chiang
  • Chia-Chen Wu
Abstract

As electronic commerce and knowledge economy environments proliferate, both individuals and organizations increasingly generate and consume large amounts of online information, typically available as textual documents. To manage this ever-increasing volume of documents, such individuals and organizations frequently organize their documents into categories that facilitate document management and subsequent information access and browsing. However, document clustering is an intentional act that reflects individual preferences with regard to the semantic coherency and relevant categorization of documents. Hence, effective document clustering must consider individual preferences and needs to support personalization in document categorization. In this paper, we present an automatic document clustering approach that incorporates an individual’s partial clustering as preferential information. Combining two document representation methods with two clustering processes, we establish four personalized document clustering techniques. Using a traditional content-based document clustering technique as a performance benchmark, we find that the proposed personalized document clustering techniques improve clustering effectiveness, as measured by cluster precision and cluster recall.
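The abstract names cluster precision and cluster recall as its effectiveness measures but does not define them. The sketch below assumes one common pairwise formulation: precision is the share of document pairs placed together by the generated clustering that are also together in the reference categories, and recall is the share of reference pairs that the generated clustering recovers. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pair_set(clusters):
    """Return the set of unordered document pairs that share a cluster."""
    pairs = set()
    for cluster in clusters:
        # Sorting makes each pair canonical, so (a, b) and (b, a) coincide.
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def cluster_precision_recall(generated, reference):
    """Pairwise cluster precision and recall (assumed formulation)."""
    gen = pair_set(generated)
    ref = pair_set(reference)
    correct = gen & ref  # pairs grouped together in both clusterings
    precision = len(correct) / len(gen) if gen else 0.0
    recall = len(correct) / len(ref) if ref else 0.0
    return precision, recall


# Toy data (hypothetical): a user's reference categories vs. a generated clustering.
reference = [{"d1", "d2", "d3"}, {"d4", "d5"}]
generated = [{"d1", "d2"}, {"d3", "d4", "d5"}]

precision, recall = cluster_precision_recall(generated, reference)
print(precision, recall)
```

Under this formulation, a clustering that lumps every document into one cluster reaches perfect recall at low precision, which is why the two measures are reported together.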


Similar Resources

Personalized Document Clustering: A Collaborative-Filtering-Based Approach

To manage the ever-increasing volume of documents, individuals and organizations frequently organize their documents into categories that facilitate document management and subsequent information access and browsing. However, document clustering is an intentional act that reflects individual preferences with regard to the semantic coherency and relevant categorization of documents. Hence, an effec...


Improving news articles recommendations via user clustering

Although commonly only item clustering is suggested by Web mining techniques for news articles recommendation systems, one of the various tasks of personalized recommendation is categorization of Web users. With the rapid explosion of online news articles, predicting user-browsing behavior using collaborative filtering (CF) techniques has gained much attention in the web personalization area. H...


Contextual Document Clustering

In this paper we present a novel algorithm for document clustering. This approach is based on distributional clustering where subject related words, which have a narrow context, are identified to form metatags for that subject. These contextual words form the basis for creating thematic clusters of documents. We believe that this approach will be invaluable in creating an information retrieval ...


User Personalization via W-kmeans

With the rapid explosion of online news articles, predicting user-browsing behavior using collaborative filtering techniques has gained much attention in the web personalization area. However, common collaborative filtering techniques suffer from low accuracy and performance. This research proposes a new personalized recommendation approach that integrates user and text clustering based on our d...


Personalized User Preference Mining from Weblogs by Agglomerative Concept Clustering

Current web search engines are built to serve all users, independent of the needs of any individual user. Search engine personalization carries out retrieval for each user by incorporating his/her interests based on user profiles. Although personalized search has been proposed for many years and many personalization strategies have been investigated, it is still unclear whether personali...




Publication date: 2007